AITopics | end-to-end system

Collaborating Authors

end-to-end system

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing SystemsOct-3-2025, 04:56:46 GMT

First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. Authors propose an attentional neural network that integrates top-down cognitive bias and bottom-up feature extraction. When there is noise in the stimulus, it can resolve the confusion by setting the appropriate top-down bias. They apply their method for digit recognition on noisy images (random background and random image) as well as two overlaid digits and report high accuracies in par or better than state of the art. I found this result interesting but I am not quite sure of the novelty of the approach specially considering a volume of research in recurrent networks, Hopfield nets, predictive coding, etc.

bia and bottom-up feature extraction, bottom-up feature extraction, cc paperinformation reviewerinstruction, (10 more...)

Neural Information Processing Systems

Country: North America > Canada > Quebec > Montreal (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.91)

Add feedback

An End-to-End System for Culturally-Attuned Driving Feedback using a Dual-Component NLG Engine

Thompson, Iniakpokeikiye Peter, Dewei, Yi, Ehud, Reiter

arXiv.org Artificial IntelligenceSep-8-2025

This paper presents an end-to-end mobile system that delivers culturally-attuned safe driving feedback to drivers in Nigeria, a low-resource environment with significant infrastructural challenges. The core of the system is a novel dual-component Natural Language Generation (NLG) engine that provides both legally-grounded safety tips and persuasive, theory-driven behavioural reports. We describe the complete system architecture, including an automatic trip detection service, on-device behaviour analysis, and a sophisticated NLG pipeline that leverages a two-step reflection process to ensure high-quality feedback. The system also integrates a specialized machine learning model for detecting alcohol-influenced driving, a key local safety issue. The architecture is engineered for robustness against intermittent connectivity and noisy sensor data. A pilot deployment with 90 drivers demonstrates the viability of our approach, and initial results on detected unsafe behaviours are presented. This work provides a framework for applying data-to-text and AI systems to achieve social good.

artificial intelligence, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2509.04478

Country:

Africa > Nigeria (0.51)
Europe > United Kingdom > Scotland (0.15)

Genre: Research Report > Experimental Study (0.47)

Industry: Health & Medicine (0.69)

Technology:

Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.58)

Add feedback

HITSZ's End-To-End Speech Translation Systems Combining Sequence-to-Sequence Auto Speech Recognition Model and Indic Large Language Model for IWSLT 2025 in Indic Track

Wei, Xuchen, Wu, Yangxin, Zhang, Yaoyin, Liu, Henglyu, Chen, Kehai, Bai, Xuefeng, Zhang, Min

arXiv.org Artificial IntelligenceJul-29-2025

This paper presents HITSZ's submission for the IWSLT 2025 Indic track, focusing on speech-to-text translation (ST) for English-to-Indic and Indic-to-English language pairs. To enhance translation quality in this low-resource scenario, we propose an end-to-end system integrating the pre-trained Whisper automated speech recognition (ASR) model with Krutrim, an Indic-specialized large language model (LLM). Experimental results demonstrate that our end-to-end system achieved average BLEU scores of $28.88$ for English-to-Indic directions and $27.86$ for Indic-to-English directions. Furthermore, we investigated the Chain-of-Thought (CoT) method. While this method showed potential for significant translation quality improvements on successfully parsed outputs (e.g. a $13.84$ BLEU increase for Tamil-to-English), we observed challenges in ensuring the model consistently adheres to the required CoT output format.

large language model, natural language, translation, (18 more...)

arXiv.org Artificial Intelligence

2507.19616

Country:

Asia > China (0.29)
Asia > Middle East > UAE (0.28)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Universal Post-Processing Networks for Joint Optimization of Modules in Task-Oriented Dialogue Systems

Ohashi, Atsumoto, Higashinaka, Ryuichiro

arXiv.org Artificial IntelligenceFeb-2-2025

Post-processing networks (PPNs) are components that modify the outputs of arbitrary modules in task-oriented dialogue systems and are optimized using reinforcement learning (RL) to improve the overall task completion capability of the system. However, previous PPN-based approaches have been limited to handling only a subset of modules within a system, which poses a significant limitation in improving the system performance. In this study, we propose a joint optimization method for post-processing the outputs of all modules using universal post-processing networks (UniPPNs), which are language-model-based networks that can modify the outputs of arbitrary modules in a system as a sequence-transformation task. Moreover, our RL algorithm, which employs a module-level Markov decision process, enables fine-grained value and advantage estimation for each module, thereby stabilizing joint learning for post-processing the outputs of all modules. Through both simulation-based and human evaluation experiments using the MultiWOZ dataset, we demonstrated that UniPPN outperforms conventional PPNs in the task completion capability of task-oriented dialogue systems.

large language model, machine learning, reinforcement learning, (19 more...)

arXiv.org Artificial Intelligence

2502.00747

Country:

North America > United States (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Japan (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Consumer Products & Services > Restaurants (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
(2 more...)

Add feedback

METDrive: Multi-modal End-to-end Autonomous Driving with Temporal Guidance

Guo, Ziang, Lin, Xinhao, Yagudin, Zakhar, Lykov, Artem, Wang, Yong, Li, Yanqiang, Tsetserukou, Dzmitry

arXiv.org Artificial IntelligenceSep-19-2024

Abstract-- Multi-modal end-to-end autonomous driving has shown promising advancements in recent work. By embedding more modalities into end-to-end networks, the system's understanding of both static and dynamic aspects of the driving environment is enhanced, thereby improving the safety of autonomous driving. In this paper, we introduce METDrive, an end-to-end system that leverages temporal guidance from the embedded time series features of ego states, including rotation angles, steering, throttle signals, and waypoint vectors. The geometric features derived from perception sensor data and the time series features of ego state data jointly guide the waypoint prediction with the proposed temporal guidance loss function. We evaluated METDrive on the CARLA leaderboard's Longest6 benchmark, achieving a driving score of 70%, a route completion score of 94%, and an infraction score of 0.78.

autonomous driving, prediction, time sery feature, (14 more...)

arXiv.org Artificial Intelligence

2409.12667

Country:

Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
Asia > Russia (0.04)
Asia > China (0.04)

Genre: Research Report (0.64)

Industry:

Transportation > Ground > Road (1.00)
Information Technology > Robotics & Automation (0.94)
Automobiles & Trucks (0.94)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Integrating End-to-End and Modular Driving Approaches for Online Corner Case Detection in Autonomous Driving

Kaljavesi, Gemb, Su, Xiyan, Diermeyer, Frank

arXiv.org Artificial IntelligenceSep-2-2024

Online corner case detection is crucial for ensuring safety in autonomous driving vehicles. Current autonomous driving approaches can be categorized into modular approaches and end-to-end approaches. To leverage the advantages of both, we propose a method for online corner case detection that integrates an end-to-end approach into a modular system. The modular system takes over the primary driving task and the end-to-end network runs in parallel as a secondary one, the disagreement between the systems is then used for corner case detection. We implement this method on a real vehicle and evaluate it qualitatively. Our results demonstrate that end-to-end networks, known for their superior situational awareness, as secondary driving systems, can effectively contribute to corner case detection. These findings suggest that such an approach holds potential for enhancing the safety of autonomous vehicles.

corner case detection, detection, end-to-end system, (12 more...)

arXiv.org Artificial Intelligence

2409.01178

Country:

North America > United States > Nevada > Clark County > Las Vegas (0.04)
North America > United States > Arizona > Maricopa County > Tempe (0.04)
Europe > Switzerland > Vaud > Lausanne (0.04)
(2 more...)

Genre: Research Report > New Finding (0.88)

Industry:

Transportation > Ground > Road (1.00)
Automobiles & Trucks (1.00)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)

Add feedback

With Good MT There is No Need For End-to-End: A Case for Translate-then-Summarize Cross-lingual Summarization

Varab, Daniel, Hardmeier, Christian

arXiv.org Artificial IntelligenceAug-31-2024

Recent work has suggested that end-to-end system designs for cross-lingual summarization are competitive solutions that perform on par or even better than traditional pipelined designs. A closer look at the evidence reveals that this intuition is based on the results of only a handful of languages or using underpowered pipeline baselines. In this work, we compare these two paradigms for cross-lingual summarization on 39 source languages into English and show that a simple \textit{translate-then-summarize} pipeline design consistently outperforms even an end-to-end system with access to enormous amounts of parallel data. For languages where our pipeline model does not perform well, we show that system performance is highly correlated with publicly distributed BLEU scores, allowing practitioners to establish the feasibility of a language pair a priori. Contrary to recent publication trends, our result suggests that the combination of individual progress of monolingual summarization and translation tasks offers better performance than an end-to-end system, suggesting that end-to-end designs should be considered with care.

computational linguistic, proceedings, summarization, (14 more...)

arXiv.org Artificial Intelligence

2409.00414

Country:

North America > Dominican Republic (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)
(10 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Infusing Emotions into Task-oriented Dialogue Systems: Understanding, Management, and Generation

Feng, Shutong, Lin, Hsien-chin, Geishauser, Christian, Lubis, Nurul, van Niekerk, Carel, Heck, Michael, Ruppik, Benjamin, Vukovic, Renato, Gašić, Milica

arXiv.org Artificial IntelligenceAug-5-2024

Emotions are indispensable in human communication, but are often overlooked in task-oriented dialogue (ToD) modelling, where the task success is the primary focus. While existing works have explored user emotions or similar concepts in some ToD tasks, none has so far included emotion modelling into a fully-fledged ToD system nor conducted interaction with human or simulated users. In this work, we incorporate emotion into the complete ToD processing loop, involving understanding, management, and generation. To this end, we extend the EmoWOZ dataset (Feng et al., 2022) with system affective behaviour labels. Through interactive experimentation involving both simulated and human users, we demonstrate that our proposed framework significantly enhances the user's emotional experience as well as the task success.

emotion, operator, user emotion, (15 more...)

arXiv.org Artificial Intelligence

2408.02417

Country:

Europe > Germany > North Rhine-Westphalia > Düsseldorf Region > Düsseldorf (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > New York > New York County > New York City (0.04)
(11 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.70)
Consumer Products & Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(2 more...)

Add feedback

RE-RecSys: An End-to-End system for recommending properties in Real-Estate domain

C, Venkatesh, Oberoi, Harshit, Goyal, Anil, Sikka, Nikhil

arXiv.org Artificial IntelligenceApr-25-2024

We propose an end-to-end real-estate recommendation system, RE-RecSys, which has been productionized in real-world industry setting. We categorize any user into 4 categories based on available historical data: i) cold-start users; ii) short-term users; iii) long-term users; and iv) short-long term users. For cold-start users, we propose a novel rule-based engine that is based on the popularity of locality and user preferences. For short-term users, we propose to use content-filtering model which recommends properties based on recent interactions of users. For long-term and short-long term users, we propose a novel combination of content and collaborative filtering based approach which can be easily productionized in the real-world scenario. Moreover, based on the conversion rate, we have designed a novel weighing scheme for different impressions done by users on the platform for the training of content and collaborative models. Finally, we show the efficiency of the proposed pipeline, RE-RecSys, on a real-world property and clickstream dataset collected from leading real-estate platform in India. We show that the proposed pipeline is deployable in real-world scenario with an average latency of <40 ms serving 1000 rpm.

interaction, platform, short-term user, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3632410.3632487

2404.16553

Country:

Asia > India > Karnataka > Bengaluru (0.05)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry: Banking & Finance > Real Estate (1.00)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)

Add feedback

Towards General-Purpose Speech Abilities for Large Language Models Using Unpaired Data

Fathullah, Yassir, Wu, Chunyang, Lakomkin, Egor, Jia, Junteng, Shangguan, Yuan, Mahadeokar, Jay, Kalinli, Ozlem, Fuegen, Christian, Seltzer, Mike

arXiv.org Artificial IntelligenceNov-12-2023

In this work, we extend the instruction-tuned Llama-2 model with end-to-end general-purpose speech processing and reasoning abilities while maintaining the wide range of LLM capabilities, without using any carefully curated paired data. The resulting end-to-end model can utilize audio prompts as a replacement for text and sustain a conversation. Such a model also has extended cross-modal capabilities such as being able to perform speech question answering, speech translation, and audio summarization amongst many other closed and open-domain tasks. This is unlike prior approaches in speech, in which LLMs are extended to handle audio for a limited number of pre-designated tasks. Preliminary results show that our end-to-end approach is on par with or outperforms a cascaded system (speech recognizer + LLM) in terms of modeling the response to a prompt. Furthermore, unlike a cascade, our approach shows the ability to interchange text and audio modalities and utilize the prior context in a conversation to provide better results.

arxiv preprint arxiv, language model, llm, (11 more...)

arXiv.org Artificial Intelligence

2311.06753

Country:

Europe > Iceland > Capital Region > Reykjavik (0.05)
Asia > Middle East > Republic of Türkiye (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(3 more...)

Genre: Research Report > New Finding (0.34)

Industry: Government > Regional Government (0.47)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback